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ABSTRACT 

We present a method to numerically estimate the densities of a discretely sampled data based 
on a binary space partitioning tree. We start with a root node containing all the particles and 
then recursively divide each node into two nodes each containing roughly equal number of 
particles, until each of the nodes contains only one particle. The volume of such a leaf node 
provides an estimate of the local density and its shape provides an estimate of the variance. We 
implement an entropy-based node splitting criterion that results in a significant improvement 
in the estimation of densities compared to earlier work. The method is completely metric free 
and can be applied to arbitrary number of dimensions. We use this method to determine the 
appropriate metric at each point in space and then use kernel based methods for calculating 
the density. The kernel smoothed estimates were found to be more accurate and have lower 
dispersion. We apply this method to determine the phase space densities of dark matter halos 
obtained from cosmological N-body simulations. We find that contrary to earlier studies, the 
volume distribution function v{f) of phase space density / does not have a constant slope 
but rather a small hump at high phase space densities. We demonstrate that a model in which 
a halo is made up by a superposition of Hernquist spheres is not capable in explaining the 
shape of v{f) vs / relation, whereas a model which takes into account the contribution of the 
main halo separately roughly reproduces the behavior as seen in simulations. The use of the 
presented method is not limited to calculation of phase space densities, but can be used as 
a general-purpose data-mining tool and due to its speed and accuracy it is ideally suited for 
analysis of large multidimensional data sets. 

Key words: methods: data analysis - methods: numerical - cosmology: dark matter-galaxies: 
halos-galaxies: structure 



1 INTRODUCTION 

One of the basic problems in data mining is to estimate the proba- 
bility distributions or density distributions based on a discrete set of 
points (particles) distributed in a multidimensional space. Once the 
density distribution is known expectation values of other quantities 
of interest can be derived. Considering the huge amounts of data 
both astronomy and other fields are facing there is a need for meth- 
ods that are accurate flexible and fast. However, most of the existing 
methods encounter problems when applied to higher dimensions. 
In the particular application of N-body simulations, the estimate of 
phase space densities is one such problem as it requires an efficient 
and flexible method for 6 dimensional phase space density estima- 
tion for a large variety of equilibrium and non-equilibrium solutions 
of largely different topology (e.g. highly flattened disks, spheroidal 
but anisotropic halos, spheroidal nearly isotropic ellipticals). 
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The simplest method for density estimation is the k nearest 
neighbor. Consider the radius r enclosing k nearest neighbors then 
density is given by fc/Vd (r) where V d(r) is the volume enclosed by 
a d-d imensional sphere of radius r jLoftsg aarden & Quesenberrvl 
more accurate method than this is the kernel density esti- 
mation (KDE) or popularly known as SPH dOingold & MonaghanI 
1 1 977l : ILucvII 1 977l : Isilverma j| 1 9861) . The results are sensitive to the 
choice of kernel function and the bandwidth of the kernel or in 
other words the number of smoothing neighbors. The later being 
more important. Variable bandwidth estimators are more superior 
as compared to the fixed bandwidth estimators. For the multidimen- 
sional case simple isotropic bandwidths perform poorly when the 
data has an anisotropic distribution. In this case one needs to se- 
lect different bandwidths in different dimensions. In general a co- 
variance matrix is determined and the bandwidth is selected so as to 
have constant co- variance in all directions. This leads to anisotropic 
kernels. The Delaunay tessellation ( Okabe et al. 1992; Okabe 2000|; 
Schaap & van de Weygaerll I2OO0I : Bemardeau & van de Weygaertl 
1996h which tessellates space into disjoint regions, performs much 
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better for anisotropic data. Delaunay tessellation is very accurate 
but also very time consuming. 

Most existing methods, including both KDE and Delaunay 
Tessellation require an a-priori definition of a metric of the n- 
dimensional space under investigation. A suboptimal choice of 
metric results in a poor estimate of the density. Metric based 
density estimators provide optimal approximations, only if co- 
variance of the data is identical along all dimensions, locally at each 
point in space. In general, however, data is non-homogeneous and 
anisotropic. Consequently the above conditions cannot be realized 
by assuming a global scaling relation among different dimensions. 
A method is required that is adaptive to the data under investiga- 
tion. Recently a new method dubbed FiEstAS w hich is metric free 
has been proposed bv lAscasibar & BinnevI ( |2005|) . FiEstAS is also 
very fast and efficient. The method relies on a repeated binary de- 
composition of space (organized by a tree data structure) until each 
volume element contains exactly one particle. The accuracy of the 
method depends upon the criteria used for splitting the nodes. In 
the simplest implementation the dimension to be divided is chosen 
either randomly or alternately, guaranteeing equal number of divi- 
sions for each dimension. The more a particular dimension is tes- 
sellated the higher the resolution achieved in that dimension. Ide- 
ally we need a scheme which makes more divisions in the dimen- 
sion along which there is maximum variation and few divisions (or 
none) along which there is minimum (or no) variation. However, 
the scheme as described above is data blind and thus fails to opti- 
mize the number of divisions to be made in a particular dimension. 

In this paper we propose and evaluate a splittin g criterion 
that i s based upon the concepts of Information Theory jShannon"] 
1 194811 19491 : iMacKav l[2003l : lGershenfeld]|l999l) . Space is tessel- 
lated along the dimension having the minimum entropy (Shannon 
Entropy) or in other words maximum information . Consequently, 
this scheme optimizes the number of divisions to be made in a par- 
ticular dimension so as to extract maximum information from the 
data. This method can also be used to determine the metric that lo- 
cally gives approximately constant covariance. Kernel based meth- 
ods can then be used to estimate the densities. 

As an application we study the phase space density of dark 
matter halos obtained from cosmological simulations. The code is 
available upon request and in future we plan to make it publicly 
available at the following url 
https://sourceforge.net/projects/enbid/ 



2 ALGORITHM 

The basic problem is to estimate the density function p(x) from a 
finite number of data points x""^, x'^..x'^ drawn from that den- 
sity function. Here x' is a vector in a space of d dimensions hav- 
ing components x\,x\..x]j^. The overall procedure of our algorithm 
EnBiD (Entropy Based Binary Decomposition) consists of three 
steps, which we will describe in detail below. First we tessellate 
the space into mutually disjoint hypercubes each containing exactly 
one particle. If is the volume of the hypercube containing zth 
particle then its density is mi/V . Second we apply the boundary 
corrections to take into account the arbitrary shape of the volume 
containing the data. Third we apply a smoothing technique in order 
to reduce the noise in the density estimate. 



2.1 Tessellation 

We start with a root node containing all particles. The node is di- 
vided by means of a hyper plane perpendicular to one of the axis 
into two nodes each containing half the particles. If j is the dimen- 
sion along which the split is to be performed, the position of the hy- 
per plane is given by the median of Xj. The process is repeated re- 
cursively till each sub-node contains exactly one particle( so-called 
leaf nodes). Let Vi be the volume of the leaf node containing par- 
ticle i, and nn be the particle mass, then the density is given 
by Pi = irii/Vi. An alternative to this, as was originally done 
in FiEstAS, is to calculate the mean < Xj > and then iden- 
tify two points one on each side which are closest to the mean. 
The split point is then chosen midway between these two points. 
Xcut ~ {xieft + Xright)/^. TMs spceds up the tessellation. 

In the implementation of FiEstAS the splitting axis alternates 
between the considered dimensions, which guarantees roughly 
equal number of divisions per dimension. In the calculation of 
phase space densities the real and velocity space are known to be 
Euclidean. Therefore the splitting is done alternately in real and 
velocity space and in each subspace the axis with highest elonga- 
tion (< x'^ > — < Xj >^) is chosen to be split. This generates 
cells that are cubical rather than elongated rectangular in the afore- 
mentioned subspace, and also helps alleviate numerical problems 
that arise when two points have very close values of a particular 
co-ordinate. We call this decomposition which is implemented in 
FiEstAS as Cubic Cells while the one free from this as General. 

For A*' particles the binary decomposition results in 2N — 1 
nodes out of which there are TV leaf nodes each having one par- 
ticle. The more a particular dimension is tessellated, the more the 
resolution in that dimension. However, for data that is uniformly 
distributed in a particular dimension there is actually no need to 
perform a split in that dimension. This fact can be exploited to in- 
crease the accuracy of the results. 

For each node we calculate the Shannon entropy Sj along each 
dimension (or subspace) and then select the axis (subspace) with 
minimum entropy. The dimension having minimum entropy guar- 
antees maximum density variation or clustered structures in that 
dimension. In other words we split the dimension that has the max- 
imum amount of information. The entropy S along any dimension 
or subspace is estimated by dividing the dimension or subspace 
into Nf, bins of equal size and calculating the number of points rii 
in each bin (we choose N], to be equal to the number of particles 
in each node). The probability that a particle is in the ith bin is 
given by Pi = rii/N where TV is the total number of particles. The 
entropy is then given by 

n 

^ = -^paog(p.) (I) 

Rather than treating each dimension independently it is also 
possible to select a subspace (real or velocity space) with minimum 
entropy and then choose an axis with maximum elongation from 
this subspace {Cubic Cells). This provides slightly lower dispersion 
in estimated densities. 

2.2 Boundary correction 

The data in general might have an irregular shape and may not 
be distributed throughout the rectangular volume of the root node. 
Consequently, the densities of particles near the boundary can be 
underestimated. This is not an issue for systems with periodic 
boundary conditions but it would be for systems which are for 
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example spherical. In higher dimensions this correction becomes 
even more important since the fraction of particles that lie near the 
boundary increases sharply with number of dimension|3- 

In FiEstAS the following correction is implemented: Suppose 
a leaf node having a particle at x'' has one of its surfaces in dimen- 
sion i either Xi = Xmax or Xi — Xmin as a boundary, then the 
boundary face is redefined such that its distance from the particle 
is same as the distance of the other face from the particle. For the 
former the redefinition is Xi — x^ + {x^ — a;™'") and for the later 
Xi = x^ — (a;'""^ — ). If both the faces lie on the boundary then 
the scheme fails to apply the correction. Moreover for small sub 
halos embedded in a bigger halo the sub halos have lower velocity 
dispersion and occupy a smaller region in velocity space, hence its 
boundary needs to be corrected even though it is not directly de- 
rived from the global boundary. A similar situation also arises near 
the center of the halos for which the circular velocity Vc{r) — > 
as r 0. Moreover in EnBiD we need to calculate the entropy for 
each node and the boundary effects might decrease the entropy of 
the system spuriously. Consequently, a boundary correction needs 
to be applied to each node during the tessellation, and not just to the 
leaf node at the end of tessellation. In EnBiD for each node having 
more than a given threshold Ub of particles, a node is checked for 
boundary correction before the calculation of entropy. In a given 
dimension if Imax and Imin are the maximum and minimum co- 
ordinates of a node and Xmax and Xmin, the corresponding max- 
imum and minimum co-ordinates of the particles inside it, then a 
boundary correction is applied if simultaneously 



and 



>fb 



(2) 



\-^min ^minj Jb Z yJ) 

^node -t 

where is a constant factor. This is effective in detecting embed- 
ded structures. To check for corrections applicable for only one 
face, the value of is chosen to be 5 times higher. For cubical cells 
in real and velocity space = 0.5N^'''^ was found to give optimal 
results, where is the total number of particles in the system. For 
general decomposition the corresponding value of ft was found to 
be 2.0^^^^^*. The node boundary Imax and Imin are corrected as 



^max 



'^node ^ 
J Xmax Xmin 

IT'node -L 

where {xmax — Xmin)/{nnode — 1) IS the cxpccted mean inter- 
particle separation. 

The choice of Ub is dictated by two factors a) if d is the number 
of dimensions of the space then a minimum of d + 1 particles are 
needed to define a geometry in that space so we set jij, d + 
1. If the number of particles in a node are too small this leads to 
Poisson errors in the calculation of the inter-particle separation so 
we impose a lower limit of rib = 7. 



2.3 Smoothing 

The un-smoothed density estimates have a large dispersion which 
cannot be reduced even by increasing the number of particles. By 
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smoothing this dispersion can be reduced provided the density does 
not vary significantly over the smoothing region. We test two differ- 
ent smoothing techniques. The FiEstAS smoothing as proposed by 
(_Ascasibar & Binney 2005) and the kernel based scheme (KDE). 

In FiEstAS smoothing, first the density of each node is cal- 
culated assuming that the mass of each particle is distributed uni- 
formly over its leaf node. Next the volume Va centered on that point 
which encompasses a given smoothing mass Ms is calculated. The 
density estimate is then given by p = Ms/Vs - For Cubic tessella- 
tion the smoothing cells are also chosen to be exactly cubical in the 
real and velocity subspaces. To calculate Vs an iterative procedure 
is used. We start with a hyper-box having boundaries in the i-th di- 
mension at Xi±Ai, Ai being the distance to the closest hyper plane 
along i-th axis of the leaf node containing the point x. Ai is then 
doubled until the mass enclosed by smoothing box M < Ms and 
then the interval is halved repeatedly till \ {M — Ms)/Ms\ ^ rjtoi 
where rjtoi is a tolerance parameter. Our experiments show that a 
tolerance parameter of 0. 1 gives satisfactory results. Although in 
FiEstAS the smoothing mass Ada = lOm^ is chosen, we find that 
choosing Ms — 2mp gives a higher resolution, while not compro- 
mising much on the noise reduction. 

In Kernel smoothing a fixed number of nearest neighbors 
around the point of interest are identified and the density is com- 
puted by summing over the contributions of each of the neighbors 
by using a kernel function. This is known as the adaptive kernel 
smoothing since the smoothing length is oc p^^'', p being the den- 
sity in a d dimensional space. The kernel function can be spherical 

of the form of W{u], u = ^X^iLi ^ being the distance of the 
neighbor from the center and Ui the corresponding co-ordinates in 
a d dimensional space, or of the form of ni=iW^(iti) known as 
the product kernel. The standard kernel scheme provides a much 
poorer estimate of the phase space density, since a global metric 
is usually unsuitable in accounting for the complex real and ve- 
locity structure encountered in many astrophysical systems. How- 
ever, with a method like EnBiD we can determine the appropriate 
metric at each point in space and thus force the co-variance to be 
approximately same along all dimensions. At any given point the 
correct metric can be calculated by determining the sides of the 
leaf node which encompasses that point, followed by a coordinate 
transformation such that the node is transformed into a cube. As 
we illustrate in the appendix, the kernel density estimator can have 
a significant bias in the estimated densities. The results we show 
here are after correcting for this bias. We tested and compared the 
use of spline and the Epanechnikov kernel function and found the 
later to be more efficient. For all our analysis we use the Epanech- 
nikov kernel function. Bias correction and other details pertaining 
to kernel based methods e.g the number of smoothing neighbors 
are given in the Appendix. The algorithm implemented in EnBiD 
for nearest ne ighbor search is based on the algorithm of SMOOTH 
( IStadelll995l) . 

Although the length of the sides of a node provides an accurate 
estimate of the metric but when trying to smooth over a region, 
the smoothing region might exceed the boundaries of the actual 
particle distribution. The smoothing lengths in such case needs to 
be appropriately redefined. This situation arises in cases where a 
dimension has very less entropy and has been split many times or 
near the boundaries of the system where the metric has not been 
accurately determined. In a given dimension let Imax and Imin be 
the maximum and minimum co-ordinates of a smoothing box or a 
sphere encompassing a fixed number of neighbors Nngb and Xmax 
and Xmin the maximum and minimum co-ordinates of the particles 
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inside it. A smoothing length correction is applied to the box, if 
simultaneously the distance to both the right and left boundaries 
given by 



{I 
{x 



max ^max ) > 25{x 



)/N^gb (6) 

)IN^3b (7) 

where N^gb is the number of smoothing neighbors. The metric is 
redefined with Imax and Imin set to Xmax and Xmin- For FiEs- 
tAS smoothing also we implement a similar smoothing volume 
correction. For a given smoothing box of volume Vs, if is 
the mass contributed by a leaf node to the smoothing box and 
Vi its corresponding volume that falls within the box, then in- 
stead of calculating the density as p = ^mi/l/s, we calculate 
it as p = (X/ "0- This correction is only applied if 

(E^^O/v; <o.5. 



3 TESTS 

To test the accuracy of the results we generate test data with a 
given density distribution in a d dimensional space and then per- 
form a comparison with the density estimates given by the code. 
We employ systems which have an analytical expression of 6 di- 
mensional phase space density /, nam ely an isotropic Hemquist 
sphere (c.f. lAscasibar & BinnevI | |2005|) ) and an isotropic ha lo with 
a Maxwellian velocity distribution (c.f. ' Arad et al] j2004h ). The 
test cases are generated by discrete random sampling of this den- 
sity function / using a fixed number of particles A*". We show here 
results of tests done in 6 dimensions only and with boundary cor- 
rection and smoothing. Results pertaining to 3 dimensions and ef- 
fe cts of boundary correction a nd smoothing are discussed in detail 
in lAscasibar & BinnevI ( l2005h . 



3.1 Hernquist Sphere 

For a lHemguisj jl990l) sphere of total mass M and scale length a 
the real space density is given by 



p(r) = 



M/(27i 



{r / a){l + r / a)'i 

and gravitational potential is given by 

GM 1 
r) = — . 

' a 1 + r/a 

The phase space density as a function of energy E — v'^ /2 + 

Mr) is 



S{E) = 



where 



47r3(2GM/a)3/2 



3sin-i q + q^l - g2(l - 2q^){9,q^ - Sg^ - 3) 
(l-g2)5/2 



(8) 



(9) 



E 



GM/a' 



First we generate a random realization in real space corre- 
sponding to density given by Eq. ([Sj. Then we use von Neumann 
rejecti on techniqu e to generate the velocities that sample the distri- 
bution |Presset2] ( [1992") . 

p(v)dv = — -/(u /2 + ^(r)Wdv 
p{r) 
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Figure 1. Dependence of fraction f/ft on ft and v{f) and ct{f) on / 
for a Hemquist sphere with A'^ = 10*' particles obtained by different algo- 
rithms for density estimation. Vertical dotted lines mark the position where 
f/ft = 0.5. EnBiD resolves the high-density regions better by about 2 
decades in density. Kernel Smoothing using the metric as determined by 
EnBiD performs even better (a gain in resolution of about 3-4 decades). 
Using a smaller number of smoothing neighbors results in higher resolu- 
tion. 



Further details can be found in lAscasibar & BinnevI ( l2005h . For 
calculating the virial quantities of a Hemquist sphere we use c = 
Rvir/cL = 4.0 which roughly corresponds to an NFW halo with 
c = 8.0. 

In top panel of Fig. [T] we plot the ratio of numerically esti- 
mated phase density / evaluated by the respective method to the 
analytical phase space density /t , as a function of ft for a Hern- 
quist sphere sampled with lO" particles. / is calculated by binning 
the particles in 80 logarithmically spaced bins in ft with at least 
5 particles per bin and then evaluating the mean value of the esti- 
mated density of all the particles in the bin. 

Ideally one expects the plot to be a straight line with f/ft = 1. 
It can be seen from the figure that the density is well reproduced 
for most of the halo for about 18 decades in density except near the 
very center where the density is very high. Both FiEstAS and En- 
BiD tessellation, followed by FiEstAS smoothing with Ms = 2mp, 
underestimate the density in the region of very high density, how- 
ever when compared to FiEstAS tessellation the high density cusp 
is resolved better by EnBiD by about 2 decades in density. In real 
space there is more variation of density as compared to velocity 
space. EnBiD accounts for this by allocating more divisions in real 
space thereby achieving higher spatial resolution, whereas FiEstAS 
gives equal weight to both spaces and ends up thus compromising 
the spatial resolution. When kernel smoothing is employed along 
with metric as determined by EnBiD tessellation (EnBiD-l-Kemel 
Smooth), there is a further gain in resolution by about 3 and 4 
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x = log(fA) 

Figure 2. Probability distributionP(log(///t)) for a Hemquist sphere 
with N = 10^ particles obtained by different algorithms for density es- 
timation. 



decades for smoothing neighbors n — 40 and n = 10 respectively. 
Lowering the number of smoothing neighbors results in higher res- 
olution. 

Next we compare the volume distribution function v{f) as 
reproduced by the code. Numerically v{f) is evaluated by bin- 
ning the particles as before in logarithmically spaced bins of /. 
If rribiri is the mass of all the particles in the i th bin, the den- 
sity of the bin being ftin = (/j+i + fi)/2 then v{fbi„) = 
{jribin / .fbin) / {.fi+i — fi)- Statistical error in each bin is given by 
^/ = fbin— < fbin > (where < fbi„ > is the mean value of 
density of all the particles in the bin). Analytically the volume dis- 
tribution function is given by 



[/(£)] = 



9(E) 
f'{E) 



(10) 



where g{E) is the density of states. For a Hernquist sphere 



giE) 



2Tv'^a^{2GM/a 



,1/2 
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3g5 

2n1/2/. 2 



[3{8q'^ - 4q^ + l)cos'^q 

i)(2g'+3)] 



It can be seen from middle panel of Fig.[T]that v{f) is well 
reproduced by both FiEstAS and EnBiD. However, in the high den- 
sity region FiEstAS underestimates v{f) which results in steepen- 
ing of the volume distribution function at the high / end, while 
EnBiD estimates the «(/) accurately to much higher densities. 

This can be seen more clearly in lower panel of Fig. [T] where 
we plot the logarithmic slope denoted by a of the volume distribu- 
tion function as function of density /. 

^ dlog{v{f)) 
" dlog(/) 

FiEstAS can reproduce the slope parameter a only till f /fvir ~ 
10^ whereas EnBiD can reproduce it till f / fvir = 10* and En- 
BiD-l-Kernel Smooth can reproduce it till f/fvir ~ 10^ and 
f /fvir = 10®, for smoothing neighbors n = 40 and 10 respec- 
tively. 

In order to get an estimate of the dispersion in the reproduced 
values of / and in order to check the effectiveness of smoothing we 
plot in Fig. |2] the probability distribution of f / ft - The distribution 
can be fitted with a log-normal distribution and the fit parameters 
are also shown in the figure. The bias is less than 0.03 dex for 
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Figure 3. Dependence of fraction f /ft on ft for a Hernquist sphere with 
A'^ = 10® particles obtained by different algorithms for density estimation. 



all the methods. The un-smoothed estimates have a dispersion of 
0.37 dex. FiEstAS smoothing with Ms = 2 is equivalent to Kernel 
smoothing with smoothing neighbors n — 40. Both of them have 
a dispersion of about 0.1 dex. For kernel smoothing lowering the 
smoothing neighbors to n = 10 results in an increase in dispersion 
to 0.18 dex. 

The EnBiD tessellation in the results as analyzed above was 
done with Cubic Cells in real and velocity space. In top right panel 
of Fig. [3] we compare the results as obtained with General decom- 
position where each dimension is treated independently. Kernel 
smoothing with smoothing neighbors n = 10 was employed for 
both of them. The estimates are nearly identical. There is a slight 
gain in resolution but the estimates with General decomposition 
were also found to have a slightly higher dispersion in the esti- 
mates. In bottom right panel we compare the result of smoothing 
between a product kernel and a spherical kernel. There is very little 
difference between the estimates. The number of neighbors were 
chosen so as to have identical dispersions in both the estimates. 
When using the kernel in product form about double the number of 
neighbors are needed to obtain identical dispersion. 

In top left panel we compare the un-smoothed densities with 
FiEstAS smoothed densities. For both of them EnBiD scheme is 
used for tessellation. The un-smoothed estimates are the densities 
as determined from the volume of the leaf nodes generated by the 
tessellation procedure. The FiEstAS smoothing only reduces the 
dispersion the resolution remains nearly unaltered. The resolution 
and accuracy is essentially determined by the density of the leaf 
nodes. N ext we compare the FiEstAS smoothing with cloud in cell 
scheme jHocknev & Eastwood! 1 1981]) of density estimation. The 
cloud in cell (CIC) method of density estimation is a special case of 
smoothing with a product kernel along with a linear kernel function 
W{u) oc ( 1 — It) . Although the FiEstAS smoothing is similar to the 
cloud in a cell scheme of density estimation but is still unique in its 
own respect. The main difference being that the clouds which are 
the leaf nodes in case of FiEstAS smoothing are disjoint whereas 
in cloud in cell scheme or in general for Kernel based schemes they 
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Table 1. Comparison of time needed to calculate densities by various meth- 
ods: This is the time taken to calculate the six dimensional phase space 
density of a Hernquist sphere with 10'' particles on an AMD XP2000-I- pro- 



cesser having 


a clock speed of 1666.67 MHz 








Method 




Tree 


Smooth 


Total 


Tessellation 


Smoothing Building 






AD Delaunay 








Iweek 


AB FiEstAS 


FiEstAS Ms = lOmp 


4s 


730s 


724s 


FiEstAS 


FiEstAS Ms = lOrrip 


8s 


522s 


532s 
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are overlapping. They can smooth over much smaller regions and 
hence achieve higher resolution as compared to FiEstAS smooth- 
ing. In bottom left panel we plot the estimates of FiEstAS smooth- 
ing alongside the estimates as obtained with product kernel with 
smoothing neighbors n=18. Instead of a linear kernel function we 
use the Epanechnikov kernel. It can be seen from the figure that the 
resolution achieved with product kernel is higher as compared to 
that of FiEstAS smoothing. 

When decomposition was done alternately in each dimension 
the median criterion gave more accurate results. However for En- 
BiD decomposition choosing the splitting point at either the mean 
or the median both gave similar results for density estimation of a 
hernquist sphere, but for a system having substructures the mean 
criterion gave better results. For all our analysis unless otherwise 
mentioned, for evaluating phase space densities we use EnBiD de- 
composition with Cubic Cells to determine the metric and then use 
the method of spherical kernel smoothing for calculating densities. 
The mean criterion is used for choosing the splitting point. The 
number of smoothing neighbors n is chosen to be 40, although 
choosing n = 10 gives higher resolution but it also has higher 
dispersion which means that the volume distribution function will 
be smoothed out below the scale set by the dispersion (see i i3.2l for 
more explanation). 

In Table. [T] we compare the CPU time needed to estimate 
the phase space density of lO'^ particles in a Hernquist sphere 
by various methods and techniques. The time as reported by 
lAsca sibar & B innevI ( |2005|) for FiEstAS is la beled as AB FiEstAS 
and the time as reported bv lAradetan ( l2004) for Delaunay Tessel- 
lation method as AD Delaunay. It can be seen that most of the time 
is needed for smoothing. For both FiEstAS and Kernel smoothing, 
increasing the smoothing mass or the number of smoothing neigh- 
bors, increases the time. Our implementation of FiEstAS smooth- 
ing is slightly faster as compared to that of lAscasibar & BinnevI 
( |2005[) due to better cache utilization. This is achieved by or- 
dering the particles just as they are arranged in the binary tree. 
The kernel smoothing which gives more accurate results requires 
a modest 20% more time as compared to the time reported in 
lAscasibar & BinnevI ( l2005h for FiEstAS. For median splitting it is 
possible to speed up the neighbor search by about 10%. 



3.2 Maxwellian Velocity Distribution Models 

For these models the phase space density is given by 

f{r,v) = p(r)[27ro-(r)^]-^/'e"'/^"''')' (11) 
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Figure 4. The cumulative distribution of f / ft as measured in different 
bins of ft for three different mock systems. The density is progressively 
overestimated in low density regions. 



where p(r) is the real space density given by 

„-''/(5r-.) 



^('■^ (r/r.)"(l + r-/r,)3— 

The velocity dispersion is assumed to be either constant with 
<Jc(r) — 0.1 or variable with a^{r) — M{r)/r. We generate 
models with a = and a = 1. 

The volume distribution function v{f) for such systems is 
given by 



where 



p{r) 



rV(r)%/2 log ^^dr 



(12) 



(13) 



[27ra(r)2]3/2 

In Fig.|4]we show the volume distribution function as recov- 
ered by EnBiD along with kernel smoothing for three different 
models \) a = G,(Tc 2) a = 1,(Tc and 3) a = 1, (t„ and with three 
different particle resolutions iV = lO" , iV = 10^ and = 10*^. 
For the highest resolution the volume distribution can be recov- 
ered for about 9 to 13 decades in /. The range of densities over 
which the v{f) is reliably recovered increases with increasing par- 
ticle number. For systems with a sharp transition in slope of v{f) 
for example a = Q,Oc system, Delaunay Tessellation was found 
to significantly over-estimate v{f) (Fig-A2 Arad et al. (2004)), be- 
cause the measured v{f) can be thought of as a convolution of the 
exact Vt{f) with a fixed window function p{f /ft)- The narrower 
thep(///t) the closer is v{f) to vt{f). If Vt{f) varies significantly 
over scales smaller than the width of p{f / ft) the shape of recov- 
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Figure 5. Comparison between density estimators EnBiD and FiEstAS in 
extracting the volume distribution function and the slope parameter a of a 
ACDM halo. The vertical dotted lines mark the position of / = fdiacr ^nd 
/ = Uelax (fdiacr being less than f relax)- The solid vertical hne marks 
the point where statistical errors ( Af / f > 0.1 in a bin) in calculation 
of v{f) become important (dashed one for FiEstAS) . The estimator FiEs- 
tAS fails to resolve the high density regions accurately and this results in 
steepening of the v{f) profile. 



ered «(/) will be affected. The v{f) will be over-estimated for a 
system with a sharp change in the slope of v{f). Moreover due to 
the width of p{f /ft) the effective cut-off value of / is also higher 
as compared to the theoretically expected upper bound. A bias in 
p{f/ ft) will also affect the results. Delaunay Tessellation estimates 
have a width of about one decade in the distribution of p{f /ft)- 
With EnBiD (using smoothing neighbors n = 40) for a = 0,ac 
system at the high / end there is very little width in the recovered 
values of /, this is the reason that v{f) is recovered better by En- 
BiD as compared to Delaunay Tessellation (Fig.|4l(. For other sys- 
tems the range of / over which v{f) is recovered is slightly higher 
for EnBiD (using smoothing neighbo rs n = 10) as co mpared to 
Delaunay Tessellation (Fig-A2 and AS lArad et all j2004) ). 



4 PHASE SPACE STRUCTURE OF DARK MATTER 
HALOS 

We are now applying our tools to the phase space structure of 
viria lized dark matter halos in a concordance ACDM universe 
JSpergel et al. 2003; Melchiorri, Bode, Bahcall, & Silk 2003). The 
structure of these halos in real space has been studied in 
great detail over the past decade and the radial density pro- 
file is known to follow an almost universa l form known as the 
NEW ( iNavarro. Frenk. & Whitell 199611 1997h profile (see however 



iNavarro et all ( |2004|) for a new a profile). 

p{r) = 



(r/r,)(l + r-/r,)2 



(14) 



The dark matter particles are coUisionless and obey the collision- 
less Boltzmann equations. For a coUisionless spherical system in 
equilibrium with a given density profile p{r) the phase space 
density /(r, v) can be cal culated using the Eddington equation 
teinnev & Tremaine II 19871) . 



fie) = 



d? p dtp 



dTp^ — ■(/) 



dp 

dip 



V)=0 



Since / is a function of six variables it is hard to study ex- 
cept in cases where there are isolated integrals of motion which 
reduce the number of independent variables. To study the structure 
of phase space density, the function «(/) is introduced which is the 
volume distribution function of /. v{f)df is the volume of phase 
space occupi ed by phase spac e elements having density between 
/ to / + df. lAradetan ( l2004l) calculated the phase space density 
using Delaunay Tessellation in 6 dimensions and studied the vol- 
ume distribution function of halos obtained from simulations. They 
found that v{f) follows an almost universal form which is a power 
law with slope —2.5 ± 0.05 which is valid for about four decades 
from fvir to fvirW^. fvir IS an estimate of the phase space den- 
sity in the outer parts of the halo. 



/v 



PVi: 



3Ape 



47r4G3 



1/2 



Mv 



1.64 X lO^h^ Mq kpc-^( kms" 



Using A = 101 



(Mvir/MQh-^) 

This behavior was also found to be independ ent of redshift and 
the mass of the halo. lAscasibar & BinnevI |2005|) used the FiEstAS 
algorithm to calculate the phase space densities and confirmed the 
above result and in addition found slight deviations both at low and 
high / end. At the low / end (near fvir) the slope was found to be 
flatter than —2.5 and at the high / end it was found to be signifi- 
cantly steeper. At the high / end there are two relevant numerical 
phase space densities, above which two-body relaxation and dis- 
creteness effects in simulations start dominating. The phase space 
density above which the tw o body relaxation is s horter than the age 
of the universe is given by jDiemand et alj|2004l) 

0.34 1 

Jrelax 



(27r)3/2G2 InArripto 
1.94 X W'^h'^ Mq kpc"^( kms" 
(mp/Me/i-i) 



(15) 



(16) 



The above value is obtained by assuming a Coulomb logarithm of 
In A = 6 and using to ~ 14.5Gyr as the age of the Universe. The 
ph ase space dens ity, above which the discreteness effects discussed 
bv lBinnevl ( l2004l) become important, is 



fd 



H^rup 

6.93 X 10* /i^ Mq kpc-='( kms~ 
imp/ Mq/i-1) 



(17) 



(18) 



Since the steepening was found to r oughly coincide with t hese d en- 
sities, this effect was attributed bv I Ascasibar & BinnevI ( |2005|) to 
the numerical effects of the simulations. 
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We analyze 5 halos at 2: = simulated in a ACDM cosmology 
with JIa = 0.7, Q.m = 0.3. To evaluate the phase space densities 
we use the EnBiD scheme along with kernel smoothing employing 
n = 40 neighbors. Halos A, B, and C were isolated from a cosmo- 
logical simulation of 128"' dark m atter particles in a 32.5/i^^ Mpc 
cube performed by AP^M code jCouchmanll 1 99 ih and were then 
re-simulated at highe r resolution from 2: = 50 to z = using 
the code GADGET (Springel . Yoshida. & Whitll200lh . Halo A' 
is a warm dark matter (WDM) realization of halo A which was 
generated by suppressing power on scales smaller than the size 
of the halo. Halo D i s from a simulation done with an ART code 
jKravtsov et alj[l997h with a box size of 80/i~^ Mpc. Further de- 
tails are given in Table. |2l For calculating phase space densities we 
use the EnBiD tessellation scheme and smoothing is done with a 
spherical kernel employing n = 40 neighbors. 

It can be seen from Fig.|5]that at the high / end there are dif- 
ferences between the phase space properties of halos as reproduced 
by EnBiD (kernel smoothing using 40 neighbors) and FiEstAS (Fi- 
EstAS smoothing using smoothing mass Ms = 2mp). We argue 
that the steepening of the v olume distribution function as found by 
lAscasibar & BinnevI ( 12005 ) is probably an artifact of the FiEstAS 
algorithm since such a steepening also appears in tests done with a 
pure Hernquist sphere (i ]3.1t . For EnBiD we do not see such steep- 
ening; on the contrary, we see a slight hump. This however does 
not preclude the association of discreteness and relaxation effects 
with the phase space structure of halos. Since we do not know the 
real phase space density of the halo it is difficult to disentangle any 
such effect from the effect of the estimator. For a WDM halo whose 
profile we expect to be the same as that of a Hernquist sphere we 
do see a sudden change in slope at around f relax (Fig. 1 10b. Also 
the slope parameter of ACDM halos have a maximum which is 
around fdiscr and beyond this it starts to fall off Fig.jS] At the low 
/ end the flatness in «(/) profile is partly due to the truncation 
of the halo at a finite radius r — Rvir- This is demonstrated in 
Fig. |6] where for a synthetic Hernquist sphere with Rcut ~ Rvir 
the v{f) profile is found to flatten out beyond / — fvir and a{f) 
rises sharply. The cosmological halos exhibit a flattening that is 
more pronounced than the synthetic halos. This suggests that their 
structure is slightly different from that of an equilibrium spheri- 
cal model corresponding to a given density profile. Models with 
anisotropy in the velocity dispersion also do not seem to suggest 
any extra flatteni ng of the v(f) profile. One possibility which was 
suggested earlier jAscasibar & Binnevl2005t) was that this could be 
due to depletion of low density phase space by the presence of high 
density sub halos co-occupying the same space. This can be ruled 
out as the low / behavior of a WDM halo that does not exhibit 
significant substructure is identical to that of a ACDM halo. 

Next we analyze the phase space structure of halos simu- 
lated in a ACDM cosmology. We see the existence of a slight 
departure from the constant power law behavior at the high / 
end (Fig. |7j. The slope parameter a (Fig. [8j has a minimum at 
around f / fvir = 10 and then it rises reaching a peak at around 
f / fvir = 10*. Beyond this it starts to falls off. 

In order to check whether the power law type behavior of the 
volume distribution function is due to the substructure or whether 
it is associated with the virialization process we simulated a WDM 
halo whose power on small scales has been suppressed and we find 
that it has a steeper slope at the high / end (Fig.|9]l. Its slope param- 
eter a as a function of / is roughly consistent with that of a Hern- 
quist sphere (Fig. llOt . This suggests that the shape of the volume 
distribution function is governed by the amount of substructure and 
its mass function. 
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Figure 6. Effect of truncation on the slope parameter a as extracted from 
a mock Hernquist sphere. The dotted line is the true analytical profile of 
a Hernquist sphere. For a halo whose R^ut = Rvir ™d c = 4.0 the a 
profile can only be extracted till / = fviv The rise in value of a beyond 
this is due to truncation of the halo. The thin dark line is for a WDM halo 
obtained from simulations. 
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Figure 7. Volume distribution function of phase space density, f (/) for 
four halos obtained from ACDM simulations. The values of f (/) for halos 
B,C and D have been shifted by 10,20 and 30 decades respectively for the 
sake of clarity. For reference v{f) oc /~^'^ curve (matched at f / fvir = 
10 )is plotted by a dotted line. An explanation of vertical lines is given in 

Fig.m 
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Table 2. Properties of halos whose phase space stracture is analyzed here: Ncut is the number of particles that lie within a cutoff radius Rcut- These are the 
particles that are used for calculating the volume distribution function f (/) of the halo. 
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Figure 9. The volume distribution function of phase space density v(f) 
for a ACDM and a WDM halo. The WDM profile has been shifted verti- 
cally by 10 decades. An explanation of vertical lines is given in Fig.|5] The 
WDM profile is significantly steeper in high density regions as compared to 
v{f) oc /"^'^ behavior which is indicated by a dashed line. 




Figure 8. The dependence of slope parameter a on / for four halos obtained 
from ACDM simulations. The values of a for halos B,C and D have been 
shifted by 3,6 and 9 respectively for the sake of clarity. An explanation of 
vertical lines is given in Fig. \5\ The dashed line represents the analytical 
profile of the parent + substructure model. The dotted line is the profile 
as estimated by EnBiD for the synthetic realization of the con'esponding 
model. The parameter a does not have a constant value of —2.5 but has 
a dip and rise and is bounded between —2.8 (the asymptotic value of a 
Hemquist sphere) and —2.1 (the value predicted by the AD Toy model) 
which are indicated by horizontal dotted lines. 
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4.1 A Toy Model:Superposition of Sub Halos 

An elegant toy model to explain the near power law behavior of 
the volume distribution functi on of simulated ACDM halos was 
proposed bv lArad et alj ( 12004]) (model AD) In this model the halo 
is assumed to be made up of a superposition of sub halos with a 
given mass function of dn/dm oc each obeying a universal 
functional form for /. The volume distribution function can then be 



Figure 10. The dependence of slope parameter o on / for a ACDM and 
a WDM halo. The behavior of WDM halo profile is in agreement with that 
of a Hernquist Sphere while that of ACDM halo is close to that of a parent 
+ substructure model. The vertical lines mark the position of fstat , f relax 
and fdiscr for the ACDM halo. 
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Figure 11. Dependence of slope parameter a on / as predicted by the toy 
model proposed in Arad et al. 1 2004) (AD) and the subsequent modification 
suggested bv iAscasibar & Binnev. t2005 .) (AB). In the limit the parameter 
'"min — > and parameter rrimax — > oo the AB model goes over to AD 
model. 



written as 



"(/) 



dn 
dm 



Vm{f)dm oc / 



(19) 



where is the mas s of the largest sub halo. However, for 
7 = 1.9, as derived by jPe Lucia etal ] l200l . this mo del predicts 
v{f) o c .f rather than v{f) o c f~^'^ as found in lArad et al.l 
l l2004l) . lAscasibar & BinnevI j2005l) modified this model by point- 
ing out that the lower limit of the integral in Eq. M9\ cannot be 
zero (model AB) since the resolution of the simulation imposes a 
limit on the minimum mass that a sub halo can have. For a halo 
sampled with a finite number of particles each of mass nip the 
minimum mass of a sub halo i s rrim in ~ lOOmp. The analysis 
as done in lAscasibar & BinnevI j2005l) assumes the sub halos to be 
Hernquist spheres and approximates its distribution function by a 
double power law 



Vm{f) 



5.46 X 10-38m3(^)-i-56 / < k/m 

f > k/m 
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-38^ 



-2.80 



(20) 



where it = 3.25 x 10^** M©^ Mpc-='(kms~^)' 
function can then be written as 

,0.54 



v{f) = 3.18({)'^-^ 
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. The distribution 
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v{f) oc 



/-i-se / < fc/m„ 
/ > fc/m„ 



(22) 



In Fig. [TT] we plot the slope parameter a as function of / as pre- 
dicted by the AD and AB Toy models (Eq. It can be seen that 
in the limit the parameter mmin and parameter m,nax ^ oo 
the AB model approaches the AD model. We can see that either 
model fails to reproduce the behavior seen in simulations. 

In both the models it was assumed that the entire halo is made 
up by superposition of sub halos with a mass function g i ven b y 
dn/dm oc m""'. In the analysis done by iDe Lucia et al.i l2004h . 
where this mass function was determined, the background parent 
halo which, which accounts about 90% of the total mass, is ex- 
cluded from the calculation. The parent halo here is not a part of 
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Figure 12. Volume distribution function of phase space density, v(f) as 
predicted by the parent + substructure model proposed here. Curves for the 
parent halo and the sub halos were shifted vertically by two decades for 
claiity. The model 'o{f) shows a slight hump beyond f / fvir = 10^ as 
compared to the constant slope f (/) oc /~^'^ behavior This is the point 
where the subhalo's contribution to v{f) starts to dominate over the parent 
halo's contribution. 
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Figure 13. Dependence of slope parameter a on / as predicted by the 
parent + substructure model proposed here for different values of the pa- 
rameters, sub halo mass fraction f^ub and minimum mass of sub halo 
^min- The profile has a minimum at log(///yir) ~ 1.5 and maxi- 
mum at log(///vir) ~ 3. As fsub increases (keeping rrimin = 10~'*M 
constant) the minimum point of a moves up till it matches with with the 
fsub = 1 AB Toy model. On the other hand as mmin is increased (keeping 
faub = 0.1 constant) the maximum point of a drops down and ultimately 
it merges with the substructure-less Hernquist profile f^ub = 0. 



the substructure population. We take this fact into account and de- 
velop a model in which we account separately for the contribution 
of the parent halo. The halo consists of 1) the parent halo with mass 
(1 ~ fsub)M modeled as a Hernquist sphere and 2) the substruc- 
ture of total mass fsubM which is modeled as a superposition of 
Hernquist spheres with a mass function of dn/dm oc m~^. To cal- 
culate the scale radius a of a sub halo of mass m we use the virial 
scaling relation Alvir oc Rvi^ which gives m oc a'' (assuming 
concentration parameter to be same for all sub halos). In Fig. [12] 
we plot the volume distribution function as predicted by this model 
for fsub ~ 0.1, mmin = 10~''A/. In order to calculate v{f) we 
employ a semi-analytic technique. We generate a sub halo popu- 



Multi-Dimensional Density Estimation and Phase Space Structure Of Dark Matter Halos 1 1 




10"' 10"^ 10° 10^ 10* 10° lO" 



Figure 14. Effect of changing the number of smoothing neighbors on the 
slope parameter a for a WDM halo. Results are shown for kernel smoothing 
with smoothing neighbors n = 40 and n = 10. The slope parameter a for 
a Hemquist sphere and a model with rrirain = 10~^ and /sui, = 0.002 is 
also plotted alongside. 
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Figure 15. The x Vs y and Vx Vs Vi; scatter plot of particles having phase 
space density above lO'^/vir for a warm dark matter halo. In top pan- 
els the density is evaluated by using kernel smoothing with 40 smoothing 
neighbors while in lower panels the density is evaluated using 10 smoothing 
neighbors. 

lation corresponding to the given mass function and mass fraction 
fsuh and then for a given value of / we sum the volume contribu- 
tion of each sub halo along with the parent halo to give the total 
v{f). The v{f) for each sub halo is determined using Eq. ^ and 
Eq. dlOt . The total v[f) as predicted by this model is close to the 
expected v{f) oc f~^'^ behavior but there is a presence of a slight 
hump in the high / part. This is similar to what we saw for ACDM 
halos Fig. [7] In the high / part v[f) is dominated by the substruc- 
ture component the transition being at around f / fvir = 10^. In 
Fig- EH we plot the slope parameter a as predicted by the model for 
various values of f^ub and rrimin- 

In Fig. [8] Q.[f) corresponding to model with parameters 



faub = 0.05, nimin ~ 10~^M is Compared against a(f) for sim- 
ulated halos. It can be seen that the model (analytical profile) is 
successful in qualitatively explaining the behavior of the simulated 
halos (namely the dip and the peak) but there is still some differ- 
ence at the low / end. At the low / end near fvir, parameter a rises 
much more sharply as compared to the model, even after taking the 
truncation effect into account. 

In (Fig. I14t we show the effect of varying the number of 
smoothing neighbors on the a profile of a WDM halo. Lowering 
the number of smoothing neighbors to 10 makes the slope parame- 
ter rise to a peak at the high / end. Since with n = 10 dispersion 
in density estimates is high, this also results in a slight flattening of 
the a profile around / = fvir, where a is found to rise steeply. 
Plotted alongside is the a profile of the best fit parent + sub halo 
model. The a profile of the WDM halo is consistent with a model 
having substructure mass fraction fsub = 0.002. In Fig.[T5]we plot 
the particles having f /fvir > 10'' in both real and velocity space. 
In top panels the density was estimated using n = 40 neighbors, 
while for lower panels the density was estimated using n = 10 
neighbors. It can be seen from the figure that warm dark matter halo 
is not completely free from substructure. More substructure is re- 
solved using smaller number of smoothing neighbors. The fact that 
even such a small amount of substructure can be detected demon- 
strates the superior ability of the estimator in resolving the high 
density regions. It also suggests that the slope parameter a plotted 
as a function of / can be used as a sensitive tool to estimate the 
amount of substructure and and the mass function of sub halos. 

To further check the efficiency of the code in reproducing the 
phase space density of a system with substructure we generated 
a mock system with fg^b = 0.1 and nimin ~ 10^^, and calcu- 
lated its phase space density using EnBiD. The results are shown in 
Fig.[T6]. The sub halos where distributed uniformly inside the virial 
radius of the parent halo and their center of mass velocity was also 
chosen so as to have a uniform random distribution within a sphere 
of radius Vvir in velocity space. For a system modeled with 10^ 
particles, the phase space structure till / = 10*/vir is successfully 
reproduced by using kernel smoothing with 10 smoothing neigh- 
bors. If 40 smoothing neighbors are used the high density regions 
are poorly resolved. Lowering the total number of particles in the 
system also leads to poor resolution at the high / end. 



5 DISCUSSION & CONCLUSIONS 

We have presented a method for estimation of densities in a 
multi-dimension al space based on binary space partitioning trees 
dAscasibar & Bi nnev 2005). We implement a node splitting cri- 
terion that uses Shannon Entropy as a measure of information 
available in a particular dimension. The new algorithm makes the 
scheme metric free and recovers maximum information available 
from the data with a minimum loss of resolution. In our tests on 
systems whose density distribution is known analytically, we find 
significant improvement in estimated densities as compared to ear- 
lier algorithms. 

We suggest how kernel-based schemes (SPH) or in general 
any metric based scheme can be implemented within the framework 
of the new algorithm: the algorithm EnBiD is used to determine the 
metric at any given point, which has the property that locally the co- 
variance of the data points has a similar value along all dimensions. 
Next we incorporate this metric into kernel-based schemes and use 
them for density estimation. We also show that SPH schemes suffer 
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m < f M ^ 



Hernquist sphere f3uh=0-0 
Model f.„h=0.1 m„i„=10"'M 
Kernel n = 40 N=10° 
Kernel n=10 N=10' 
Kernel n = 40 N=10 = 




10" 
f/fvi, 



10* 



Figure 16. Dependence of slope parameter q on / as recovered by En- 
BiD from a parent + substructure model. The fraction of mass in the form 
of substructure is fs-ub = 0-1. the minimum mass of the substruc- 
ture is mmin = 10~*M, M being the total mass of the system. The 
theoretically expected slope parameter for the above model and for a Hern- 
quist sphere without any substructure is also plotted alongside. For a system 
sampled with lO" particles, the parameter a can be accurately predicted till 
f/fvir = 10'' using kernel smoothing with 10 smoothing neighbors. 



from a bias in their density estimates. We suggest a prescription that 
can successfully correct the bias. 

As an immediate application, we employ this method to ana- 
lyze the phase space structure of dark matter halos obtained from 
N-Body simulation with a ACDM cosmology. We find evidence for 
slight deviations from the near power law behavior of the volume 
distribution function v(f) of halos in such simulations. At the high 
/ end there is slight hump and the low / end there is significant 
flattening. We also analyzed a WDM halo and found that its slope 
parameter profile q(/) at the high / end is consistent with that 
of an equilibrium Hernquist sphere having a very small amount of 
mass (0.2%) in the form of substructure. 

In ACDM halos the contribution to the volume distribution 
function at the high / end is dominated by the presence of signif- 
icant amount of substructure. We devise a toy model in which the 
halo is modeled as a Hernquist sphere and the substructure is mod- 
eled as a superposition of Hernquist spheres with a fixed mass frac- 
tion and a mass function dn/dm oc m~^'^. We demonstrate 
that this reproduces the behavior of «(/) as seen in simulations. 

The behavior of «(/) and a(/) depends upon the parameters 
/sui),inass function dn/dm of sub halos, and mmin the minimum 
mass of the sub halo. Since the mass function of sub halos and their 
fraction /sub depends upon the power spectrum of initial conditions 
and on the cosmology adopted, the phase space structure of the 
halos might have an imprint of cosmology and initial conditions 
which might be visible in the profile q(/). 

Although the simple toy model that we propose here can ex- 
plain the basic properties of the volume distribution function there 
is still some difference at the low / end. The flattening at low / 
end is more pronounced in simulated halos as compared to those of 
model halos, even after taking the truncation effect into account. 
Further improvements on the model described include: The toy 
model assumes that all sub halos obey the same virial scaling rela- 
tion while in simulation there should be slight dependence on the 
time of formation of the sub halo. Moreover the sub halos may be 
tidally truncated and stripped and so their de nsity profile may be 
different from that of a pure Hernquist sphere jHavashi et al.l20()3l : 



iKazantzidis et al.ll2004h . Furthermore, there might be a radial de- 
pendence on the properties of sub halos. A detailed model which 
takes into account these effects might help explaining the phase 
space properties more accurately. 

The issue of universality in the behavior of the volume dis- 
tribution function still deserves further investigation. For the four 
halos that we have analyzed one of them had a nearly flat a(/) pro- 
file and the others showed a characteristic dip at / ~ IQfvir and 
a corresponding rise which peaks at around / ~ ICi* fvir- Larger 
samples of halos need to be investigated in order to put these re- 
sults on a sound statistical basis. The differences that are seen in the 
properties of halos might be due to varying degree of virialization. 
The second concern is regarding the role of numerical resolution on 
the behavior of the volume distribution function. In the model the 
shape of the a{f ) profile depends upon the minimum mass mmin 
of the sub halo used to model the sub halo population. According 
to the model a(/) has a minimum at around f / fvir ~ 10 and 
then it rises to a peak at around f / fvir ^ 10^ whose maximum 
value is determined by the logarithmic slope of the mass function 
and is given be —(4 — 7). Beyond this point increasing the reso- 
lution should make the q(/) reach a plateau and then fall off once 
it reaches the resolution limit of the simulation which occurs ap- 
proximately at freiax/ fvir ~ 10~^ Myir/mp. This suggests that 
a proper convergence study needs to be done to establish the uni- 
versality in the phase space behavior of the halos. At higher resolu- 
tion existence of a behavior different from the toy model suggested 
here would imply that there are some physical processes at work 
which significantly alter the properties of low mass sub halos and 
drive the system towards a universal behavior e.g. the one with a 
constant slope. 

Our analysis here shows that the phase space properties of the 
halos that are roughly consistent with equilibrium spherical models 
with a given density profile in real space. A question of fundamen- 
tal importance is regarding the origin of the universal behavior of 
these density profiles as seen in simulations. A clue to which might 
be found by studying as to how the system approaches equilibrium. 
The evolution of the distribution function of collisionless parti- 
cles is governed by the collisionless Boltzmann equation. Since the 
coarsely-grained distribution function of collisionless particles can 
be measured directly with EnBiD, this offers interesting opportuni- 
ties to study the processes of phase mixing and violent relaxation, 
which help the system to reach equilibrium. It might be interest- 
ing in this context to study the evolution of the volume distribution 
function of the halos with time. 

Another interesting application of this method is to study the 
distribution function of equilibrium systems e.g. a disk that hierar- 
chically grows inside a halo. One can study the distribution function 
of these systems and this can in turn be used to construct equilib- 
rium models. 

Finally we would like to point out a potential improvement in 
the code. If the density distribution in any dimension is linearly in- 
dependent of the other dimensions then this offers an opportunity to 
further improve the density estimates by measuring the density dis- 
tributions in different dimensions separately. The concept of mutual 
information offers one such way to quantify this linear dependence 
or independence. An algorithm can be developed which can exploit 
this feature and improve the density estimates in situations where 
the data offers such an opportunity. 
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APPENDIX A: KEIfNEL DENSITY ESTIMATE 

Ror the so called kernel density estimate (KDE) a kernel W is de- 
fined such that 



(Al) 



The density estimate of a discretely set of N particles at a point x 
is given by 



/9(x) = ^ mi W(xi - X, h) 



while the probability density /(x) is given by 
/(x) = ^$]w^(xi-x,h) 



(A2) 



(A3) 



The smoothing parameter h is chosen such that it encloses 
a fixed number of neighbors N smooth. Assuming spherical sym- 
metry the kernel can be written in terms of a radial co-ordinate u 
only. So me of the popular choices are G aussian function and the B- 
splines jMonaghan & Lattanzio! [l98l) . The later is preferred due 
to its compact support. A d dimensional multivariate bandwidth 
spherical kernel can be written as 



W(x,h) 



where 



a 



and the normalization fd is given by 



1 



W{u)SdU'i-'du 



(A4) 



(A5) 



(A6) 



Sd being the surface of a unit hyper-sphere in d dimensions Vd its 
volume. 

Sd = 27r'*''Vr(rf/2) ; Vd = Sd/d (A7) 

Some popular kernels are given below and their normalizations 
constants fd are listed in Table. IaTI 



WGaussian{u) — exp( — It ) fd — 



r.d/2 



Top-H, 



at{u) 



1 ^ w ^ 1 
otherwise 



Wsp 




. _ J_ 

Vd 

< ii «; 0.5 

0.5 < ii 1 
otherwise 



(A8) 



(A9) 



(AlO) 
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Table Al. Normalization constants for various dimensions 



Dimension 
d 




Normalization 




Spline 


Epanechnikov 


Bi-Weight 


1 


1.3333369 


0.75000113 


0.93750176 


2 


1.8189136 


0.63661975 


0.95492964 


3 


2.5464790 


0.59683102 


1.0444543 


4 


3.6606359 


0.60792705 


1.2158542 


5 


5.4037953 


0.66492015 


1.4960706 


6 


8.1913803 


0.77403670 


1.9350925 


7 


12.748839 


0.95242788 


2.6191784 


8 


20.366416 


1.2319173 


3.6957561 


9 


33.380983 


1.6674189 


5.4191207 


10 


56.102186 


2.3527875 


8.2347774 




Epanechikovi.'^') — 



otherwise 



(All) 



Figure Al. The variance of density estimates, as obtained by kernel 
smoothing using 100 smoothing neighbors, as a function of number of di- 
mensions. The solid lines are calculated using Eq. jA24) while the points 
are the a extracted from a Poisson sampled data by applying kernel smooth- 
ing. 



WBi-Weightiu) — 



(1-' 





< M ^ 1 

otherwise 



For kernels in product form 



l¥(x,h) 



(A 12) 



(A13) 



where Ui — Xi/hi and /i is the corresponding one dimensional 
normalization factor as given by Eq. iA6\ . 



Al Optimum Choice of Smoothing Neighbors 

If f{x) is the estimated probability density of a field f(x) then its 
mean square error (MSE) can be written in terms of its bias P{x) 
and variance a{x). Bias of an estimate is given by 



f3{x) = (fix)) ^ fix) 
while its variance is 
a\x) = {[fix) ~ (fix))]') 
Hence mean square error is given by 

MSElfix)] = {[fix)^fix)Y) 



(A14) 



(A15) 



(A 16) 



= ( [fix) - (fix)) + {fix)) - fix)] (A17) 



a^ix) + /3^ix) 



(A18) 



To get accurate estimates both bias and variance should be small. 
Using the fact that 



Aix ~ Xi)) = N / Aix-x')fix')dx 



1=1 



(A 19) 



the bias and variance of an estimator can be calculated by using 
Eq. l lA3b and expanding fix') as a Taylor series about x. For a d 
dimensional multivariate kernel density estimate,the bias and vari- 
ance are given by 

1,2 r 
Pix) « —Tr[Hfix)] I u''Wdiu)SdU-^du (A20) 



where Hfix) 



a-f 

dxidx . 



is the Hessian matrix of function /(x) 



^ ^smooth 

f {^) 



w!iu)Sdu''-''du 



Wiiu)SdU-^du 



\Wd\\l 



(A21) 

(A22) 

(A23) 
(A24) 



IIVKdjll being the d dimensional norm of kernel function 

W4u). 

Lowering h or equivalently lowering Ngrnooth lowers Pix) 
but increases a (x) . Ideally the optimum choice of N smooth is given 
by minimizing the MSE. The bias /3, which depends on the second 
order derivative of the field, is small for slowly varying fields, hence 
can be ignored. Since cr(a;) oc l/\/ N smooth, the variance increases 
as Nsmooth is decreased. The minimum value of Nsmooth that is 
needed to attain a given value of oix) is the optimum choice of 
number of neighbors. We define this lower limit on o as 0.25/(a::). 
In Fig. lAll (T is plotted as a function of number of dimensions d 
for Nsmooth = 100 (assuming fix) — 1). The variance as ob- 
tained by applying kernel smoothing on a Poisson sampled data 
with Nsmooth ~ 100 is also shown alongside. They are in agree- 
ment. The variance a does not increase exponentially with num- 
ber of dimensions. Hence the optimum number of neighbors also 
do not have to grow exponentially with the number of dimensions. 
This means that even in higher dimensions kernel smoothing can be 
efficiently done employing a small number of neighbors. In higher 
dimensions the efficiency of the nearest neighbor search algorithm 
is the main factor which determines the time required for kernel 
density estimation. It can also be seen from Fig. lAll that for a fixed 
number of neighbors the spline kernel gives maximum variance 
while the Epanechnikov kernel gives the lowest variance. Eq. l lA24t 
can be used to calculate the number smoothing neighbors Nsmooth 
required to achieve a given a, for any given kernel in any arbitrary 
dimension. For density estimation with an Epanechnikov kernel in 
6 dimensions, Nsmooth ~ 32 gives a variance of a = 0.22 which 
is equivalent to a variance of 0.1 dex. 
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A2 Fraction of Boundary Particles 

For a system of A*' particles uniformly distributed in a spherical 
region in a rf dimensional space the fraction of particles that lie 
on the boundary increases sharply with the number of dimensions 
d. If I is the mean inter-particle separation then / = {Vdv'' /N)^^"^ 
and the fraction ft is given by 

fb = {Sy/l^)/{Vy/f')=dl/r = d{Va/Ny/'' 

For iV = W'^, the fraction ft is 0.05 and 0.79 for d = 3 and d = 6 
respectively. 



A3 Anisotropic Kernels 

For planar structures which are not parallel to one of the co-ordinate 
axis one needs to adopt an anisotropic kernel to get accurate results. 
This is equivalent to a transformation with a rotation and a shear 
which diago nalizes the covarian ce matrix and then normalizes the 
eigenvalues ( Shapir o et alj 1996 1. Let H be a diagonal matrix such 
that Hii = hi and x' = i/~^x. If C(x') is the covariance matrix 
locally at point x' then the kernel is given by 



W(x,h) 



\D\^/^\H, 



Wd {\D~'^''^EH'^yi 



(A25) 



where E is the eigenvalue matrix that diagonalizes C and D is the 
corresponding diagonal eigenvalue matrix. To keep the number of 
smoothing neighbors roughly constant we normalize the eigenvalue 
matrix, D D/jDj^/'', this preserves the smoothing volume. To 
identify the neighbors that contribute to the density at x one now 
needs to select a spherical region with radius h' = h max(D"'^'''^). 



A4 Bias in Spline Kernels 

Spline kernels have a bias in their estimated densities i.e. they sys- 
tematically overestimate the density. This is not present for a reg- 
ularly distributed data like a lattice or a glass like configuration 
where the inter-particle separation is constant 0. This only occurs 
for a data which has Poisson noise and whose density is measured 
at the location of the data points. In some sense the bias is due to 
evaluation of the density at the location of Poisson peaks in the 
density distribution. The smaller the distance from the center the 
greater the weight of the kernel. When the density is estimated at 
the location of the particle the kernel assigns a very high weight to 
this particle since its distance is zero. Below is shown a simple cal- 
culation which demonstrates the bias in a spline kernel as compared 
to a top hat kernel which is free from such bias. 



ft 



mWr 



(A26) 

(A27) 
(A28) 



Assuming that the top hat kernel gives the correct density ft = 
k/iVdh''). Taking one particle out from the smoothing region 



Bios Correction ON 
Bias Correction OFF 



Correction OFF: 
=x> = -0.016 1 

(7, =0-151 : 

Correction ONF 
Cx> = 0.212 : 

(7, =0.100 : 



Bios Correction ON 
Bias Correction OFF 



' Smooth : 

Correction OFF 
<x> = 0.003 1 

CT, = 0.100 : 

Correction ONF 

<x> = 0.043 : 

(7, = 0,096 : 



og{t/f,) 



= log(f/f,) 



Figure A2. Kernel density estimates with and without bias correction, 
are shown for a system of N = 10^ particles distributed uniformly 
in a 6 dimensional space with periodic boundaiies. Probability distribu- 
tion P{log{f / ft)) is plotted for spline kernel with smoothing neighbors 
n = 64 (left panel ) and Epanechnikov kernel with n = 32 (right panel). 
The mean < x > and dispersion ax of the best fit Gaussian distribution to 
X = log{///t), is also shown alongside. 



should roughly give a density of '^iJl mWi — m(k ~ 1)/Vdh'^ 



mWr=o + {k - l)m/{Vdh'^ 



= 1 + 



km/{Vdh'i) 
fdVd - 1 



(A29) 

(A30) 
(A31) 



It can be seen from Eq. iA3U that the bias decreases when 
the number of smoothing neighbors k is increased. This bias can 
be removed by displacing the central particle having r = to r = 
hd/ (1 + d), h being the radius of the smoothing sphere, and d the 
dimensionality of the space. This corresponds to the mean value 
of radius r of a homogeneous sphere in a d dimensional space . 
This correction should only be applied if the distribution of data is 
known to be irregular. 

In Fig. lA2l kemel density estimates with and without bias cor- 
rection are shown for a system of N = 10^ particles distributed 
uniformly in a 6 dimensional space with periodic boundaries. In left 
panel the probability distribution P(log(///t)) is plotted with and 
without bias correction, for kernel density estimate obtained using 
a spline function and smoothing neighbors n = 64. In right panel 
the probability distributions are plotted for kernel density estimates 
obtained using an Epanechnikov function and smoothing neighbors 
n — 32. The bias given by mean < a:: > of the best fit Gaussian 
distribution is also plotted alongside. According to Eq. l |A3U . in 
a 6 dimensional space for spline kernels with neighbors k — 64 
the bias is < log(/sp//t) 0.21 and for Epanechnikov kernel 
with fc = 32 the bias is < log(/Bp//t) >= 0.04. These values 
are close to those shown in Fig. I A2 1 for uncorrected estimates. The 
Epanechnikov kernel function has less bias than the spline kernel 
function. After correction, for both the kernels, the bias is consid- 
erably reduced. 



^ This bias does not affect the results in SPH simulations beca use the par- 
ticles are not distributed randomly but rather by the dynamics i Monaghai] 
[l992). The dynamics of the pressure forces results in a configuration which 
is regular and with nearly constant inter-particle separation. 



